Structural Fluctuation of Protein in Water around Its Native State: A New 

Statistical Mechanics Formulation 



Bongsoo Kim^ and Fumio Hirata^ 

^ Department of Physics and Institute for Soft and Bio Matter Science, 
Changwon National University, Changwon 641-773, Korea 
^ College of Life Sciences, Ritsumeikan University, Kusatsu, Shiga 525-8577, Japan 

(Dated: October 31, 2012) 

A new statistical mechanics formulation of characterizing the structural fluctuation of 
protein correlated with that of water is presented based on the generalized Langevin equation 
and the 3D-R1SM/RISM theory of molecular liquids. The displacement vector of atom 
positions and their conjugated momentum, are chosen for the dynamic variables for protein, 
while the density fields of atoms and their momentum fields are chosen for water. Projection 
of other degrees of freedom onto those dynamic variables using the standard projection 
operator method produces essentially two equations which describe the time evolution of 
fluctuation concerning the density field of solvent and the conformation of protein around an 
equilibrium state, which are coupled with each other. The equation concerning the protein 
dynamics is formally akin to that of the coupled Langevin oscillators, and is a generalization 
of the latter, to atomic level. The most intriguing feature of the new equation is that 
it contains the variance- covariance matrix as the "Hessian" term describing the "force" 
restoring an equilibrium conformation, which is the second moment of the fluctuation of 
atom positions. The "Hessian" matrix is naturally identified as the second derivative of the 
free energy surface around the equilibrium. A method to evaluate the Hessian matrix based 
on the 3D-RISM/RISM theory is proposed. Proposed also is an application of the present 
formulation to the molecular recognition, in which the conformational fluctuation of protein 
around its native state becomes an important factor as exemplified by so called "induced 
fitting" . 

I. INTRODUCTION 



Structural fluctuation of protein around its native state plays essential roles in a variety of processes 
in which the biomolecule performs its intrinsic function [1]. For example, so called "gating" mech- 
anisms of ion channels are regulated by the structural fluctuation of amino-acid residues consisting 
the gate region of the channel. Molecular recognition such as the formation of an enzyme-substrate 
complex in an enzymatic reaction is controlled often by structural fluctuation of protein. Few typi- 
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cal examples of structural fluctuations around a native conformation of protein, related to function, 
are "breathing "hinge-bending" [3^, and "arm-rotating" motions [4]. Those motions are collective 
in nature involving many atoms moving in the same direction. Those structural fluctuation associ- 
ated with protein functions, whether it's large or small, stays within its native conformation, and 
does not induce global conformational change such as denaturing, with few exceptions exemplified by 
intrinsically disordered protein Q]. 

In actual biological processes, solvent plays vital roles both in the equilibrium and in fluctuation of 
protein It may not be necessary to spend many words for emphasizing the crucial role played by 
solvent for stabilizing or destabilizing native structure of protein, such as the hydrophobic interaction 
and hydrogen bonds. Here, let us consider roles played by water in fluctuation of protein around 
its native conformation, associated with recognition of a ligand by protein. The process is primarily 
a thermodynamic process, governed by the free energy difference between the two states before and 
after the recognition. It is obvious that water plays crucial role in the thermodynamics, since the 
equilibrium structures are determined by the free energies including the excess chemical potential or 
the solvation free energy of water. However, it is not the only role of water in the process. Water 
actually regulates the kinetic pathway of the process as well by controlling the structural fluctuation 
of amino-acid residues consisting the active site. An example of such processes is a mouth-like motion 
of amino-acid residues. The open-and-close motion of the mouth is driven not only by the direct 
force acting among atoms in protein, but by that originated from the solvent induced force which 
is in turn caused by the fluctuation in the solvation free energy, or the non-equilibrium free energy. 
In an actual biomolecular process, such conformational change around the native state is induced 
often by some perturbation upon amino-acid residues around the active site, for example, binding of 
a ligand. However, response to the perturbation should be linear, because the protein recovers its 
native conformation upon removing the perturbation [7]. 

It is not surprising that considerable efforts have been devoted to clarify the conformational fluctu- 
ation of protein theoretically, which has started at the end of the last century based on the molecular 
mechanics or dynamics. One of earliest attempts was to relate the structural fluctuation to the normal 
mode of protein [8]. Those works have demonstrated the importance of the collective mode in the 
fluctuation. However, those efforts have not provided a realistic physical insight into the dynamics 
of actual biological processes, since they are concerned with a protein in "vacuum", which obviously 
cannot describe the fluctuation correlated with that of solvent. The principal component analysis 
involving diagonalization of the variance-covariance matrix of conformational fluctuation, extracted 
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from the molecular dynamics trajectory of a protein in water, has revealed some important aspects 
of the conjugated fluctuation between a biomolecule and water Qj. The lowest frequency mode of 
fluctuation around a native conformation exhibits an activated transition from a minimum to another 
minimum in the conformational space, akin to the jump diffusion model of liquids. However, the pro- 
cedure cannot be extended readily to that associated with such a process as ligand binding, because 
the process is concerned with sampling of large configuration space involving both protein and solvent. 
It becomes formidable especially when the solvent consists of several chemical components such as the 
electrolyte solution. 

In the present work, we propose a new first-principle approach to treat structural fluctuation of 
protein conjugated with that of solvent, based on the two theoretical frameworks in the statistical 
mechanics of liquids, or, the 3D-RISM/RISM (Reference Interaction Site Model) theory and the 

The 3D-RISM/RISM theory 



generalized Langevin equation 



111 has proven itself to be capable 

n 



of predicting the molecular recognition of ligand by protein which has a rigid structure [13]. The 
generalized Langevin equation should be able to describe the fluctuation of a system consisting of 
protein and solvent around its equilibrium state. Therefore, it is reasonable to expect that the two 
theories combined together will produce a method which can describe the molecular recognition process 
2y protein, whose structure is fluctuating. An etude of such a theory has been already published by us 



131 ] where a much more simplified model, a chain of identical particles in solvent consisting of spherical 
molecules, was considered. The key idea there lies in the choice of dynamic variables. We have chosen 
four quantities to form a vector in the phase space: the displacement of atom positions in protein from 
their equilibrium coordinates, the conjugated momentum of those atoms, the fluctuation of the density 
field of solvent molecules, and their conjugated momentum field or flux. A standard treatment of the 
dynamic variables due to the projection operator method [ij] gave rise to four equations with 
respect to the time evolution of those quantities, two for solute and two for solvent, which interplay 
with each other. Most important observation in the results is that the equation of motion concerning 
the solute dynamics includes the variance-covariance matrix regarding the conformational fluctuation 
of solute as a "Hessian" or a "force constant" of the "oscillation" or fluctuation. Here, we generalize 
..eo.. developed i„ «.e p.ecedin, pape. Q s...U„«aI„ i.. o.de. to .e aMe to ... a .ea,i.ic 
protein in a realistic solvent such as water. 
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II. PROJECTION OPERATOR METHOD: SUMMARY 



Since the projection operator method is well-known 15l4l7l]. we here only summarize the general 
results of the method. It gives the time evolution equation of a dynamic variable A{t) which is a 
function of microscopic variables. Its microscopic time evolution is governed by the Liouville operator 
iC whose expression will be given in the next section: 

^ ,CA{t) = {A, H}pB{t) (II.l) 

where {a, 6}pb is the Poisson bracket, and T-L is the Hamiltonian of the system. The formal solution 
of (III.1|1 is given by 

A(t) = e^^*A(0) = e*^*A (II.2) 

Now the projection operator V is defined as 

P(---) = (A,---)(A,A)-^A (II.3) 

The inner product (a, b) denotes an average of the canonical distribution exp ( — T-L/ksT): 

(a, b) = (a*b) = Z-^ J dr a*b exp ( - ^(F) /^bT) (II.4) 

where T denotes all microscopic degrees of freedom in the system. The operator projects out only the 
'component' of A from the object (• • •). Then obviously VA = A holds. It also has the idempotent 
property = V. 

After projecting A-component out of the microscopic degrees of freedom, the exact time evolution 
equation for A(t) is given by 



dA{t) 



m ■ A{t) - t ds K(t - s) ■ A{s) + f (t) (II.5) 

JO 



dt 

Here the frequency matrix ifi, the memory matrix K(t), and the fluctuating force vector f(t) are given 
by 



in = (A,A)-(A,A)-\ 



K(t) = (f,f(t)).(A,A)-\ 

f(t) = exp - (1 - P) A (II.6) 

One can show easily that the fluctuating force f(t) does not have A-component, i.e., (A,f(t)) = 0. 
Using this feature and the linearity of the equation, we immediately obtain the following dynamic 
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equation for the auto-correlation function of A{t), C{t) 

= in ■ C{t) - /* ds K{t - s) ■ C{s) (11.7) 
at Jq 



III. GENERALIZED LANGEVIN EQUATIONS FOR A SOLUTE-SOLVENT SYSTEM 

Our main concern here is a protein-water system at infinite dilutions. However, the formulation 
is completely general for any solute-solvent system at infinite dilution. So, in the formulation, we 
consider a general solute-solvent system. In particular, we consider a solute molecule consisting of Nu 
atoms immersed in solvent consisting of molecules, each having n atoms. The Hamiltonian of the 
solute-solvent system is then given by 

H = Ho + Hi +^2) 

N n a a 

^0 = E E + E E ^o(l< - r^D] (solvent) 

i=l a=l " j^i b^a 

^1 = E [-^M^ + E - %!)] (solute) 

Nu N n 

H2 = E E E - I) (solute-solvent) (III.l) 

a=l i=l a=l 

where Mq denotes the mass of the ath atom in the solute particle, and nia the mass of ath atom in a 
solvent molecule. The Hamiltonian of the solvent is denoted by Ho where rf and pf are respectively 
the position and momentum of ath atom in the ith molecule of the solvent, and Uo{rfj) {rfj = |r" — r^|) 
is the pair potential energy between them. Hi is the Hamiltonian of the Nu solute atoms, and 
and Pa are the position and momentum of the ath solute atom (we preserve the Greek indices for 
denoting the solute atoms), and [/«„*( I R-a ~ |) is the interaction potential energy between the ath 
solute atom and the ath atom of the ith molecule in the solvent. 
The associated Liouville operator iC is given by 

iC = iCo + iCi, 



. , Lm„"' dxl ^.j-' drf dpf ^ drf dpf 



rP r) 8 1 

where Fq, = F^"^ +Fa \ and F^"^ is the force exerted on the ath solute atom by the other solute atoms, 
F^'' the force exerted on the same solute atom by the solvent molecules. Their explicit expressions 
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are given by 



■E 



dUi{R, 



5Rn 



(IIL3) 



1=1 a=l 



Our dynamic variable A(t) is chosen to be 



/ 



(III.4) 



AR,(t) \ 
Pa(t) 

V JkW y 

Here ARq, (t) is the displacement of the position vector R^ of the a-th solute atom from its equilibrium 
value. And 5p^^{t) is the Fourier component of the density fluctuation 6p"'{r,t) = p°'[r,t) — (pg is 
the average number density of the ath atom) of the solvent liquid: 

i 

5pl{t) = I dre^i^-J/lr, t) =J2 ^"^'^^^^'^ " i'^^f Pim (ULS) 

i 

Likewise, Jk(^) is the Fourier component of the current of the ath atom in the solvent liquid: 



mo 



(III.6) 



Balucani and Zoppi [15] have worked out the special case where only the momentum of a solute particle 
was chosen as a dynamic variable. 

We now proceed to obtain the specific expressions of the eqs. (|lL5l) or dTLH). First one has to 
compute the correlation matrix (A, A) and its inverse (A, A) ^. The inner product denotes average 
over the canonical distribution exp ( — fH-LiV)) with /? = \/{kBT): 



(a,b)^(a*b) = iy' dra*(r)b(r)e- 
where Z is the partition function Z = /dFexp ^ — pi-LiT)^. 



(III.7) 
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A. The correlation matrix (A, A) 



The correlation matrix C = (A, A) is given by 

^(AR„,AR^) (Pa,AR^) {6pi,AKp) {3'^,AK^)\ 

^^^^^ (AR,,P/3) (Pa,P/3) (5pLP/3) (Jk>P/3) 

{An^,6pi) {Pa,Spi) {6pi,Spi) {3i,6pi) 

[ (AR„,4) (p,,4) i6pi,Ji) (j£,j^)y 

We first identify the vanishing elements. The following elements vanish: 



(IIL8) 



(AR„,P;3) 
(P«,AR;3) 

(5p£,P/3) 
(J£,AR;3) 



0, 



(AR„,4) = 



0, {P^,6pi) = 0, 



(Pa) Jk) 



0, 



0, (5p^,4) = o, 



They vanish since the momentum integrations 



J dp"^prexp(-/3 5:5: 



2ma ' 



(IIL9) 



0, 



y dP^" Po exp ( - /3 ^ P^ /2M^) = 0. 



We now look at the nonvanishing elements. The momentum correlation of solute particles is easy 
to compute: 



(Pa,P/3) 



1 



dP^"P„P^e-''^.^^/'*'- 



= ^ 1 dP^" P,( - M^ksT)^^ g-/3E,P^/2M, ^ A;bTM„1,5«^ (III.IO) 

where Zp = J dP^" exp ( - /3 P7/2^7)' ^nd 1 is the unit (3 x 3) matrix. The eq. (|111.10l) is 
nothing but the equipartition theorem. 

Since the general current-current correlation function (J^, J^) will have non- vanishing correlation 
between the same Cartesian components only, it is sufficient to define the current-current correlation 
function as 



Jab{k) = -^<^J"k ■ Jk 



(III.ll) 



Its calculation is somewhat involved: 
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-i'k-ir'^-r''. 



n 



(III.12) 



A general expression of this quantity is given in Eq. (7) in [18]. 

The remaining elements (ARq, AR^), (ARq,,(^p^), ((^p^, AR^), and {6p^,6p]J involve the spatial 
coordinates only. We consider them in order. In the present work we will not specify particular form 
for correlation of initial position of solute particles since here we are interested in laying out general 
structure of the dynamics. We first have the displacement correlation matrix for the solute particles 



= AR„, AR;3 



(III.13) 



where Jjap is a {3Nu x 3A^„) matrix. We next consider (ARq,, (5/3^). First note that (AR^^p^) 



{AKapi) - (27r)Vo5(k)(ARa) = {AKaPi) since (AR„) = 0. Therefore 



Bf^{AK^,5pi) = {An^pi) 



(III.14) 



In Appendix A, we show that this quantity and its transposed one vanish; 



B„,6(k) = B„,;3(k) = 



(III.15) 



Finally, we have the static structure factor of solvent molecular liquid defined as 



Xa6(k) = ^i6pl,dpi) = ^{6pl^5pi 



fA,A) 



(III.16) 



This can be calculated using the RISM theory. 

Summing up the above results, we have the following block-diagonal matrix for (A, A) 

L„/3 O o\ 

O ksTAUU^p 

0^ 0^ Nxab{\^) 

V 0^ 0^ 0^ NJab{k) J 

where O denotes the {3Nu x 3Nu) zero matrix, the (3A'"„ x n) zero matrix, the (n x n) zero matrix, 
and the superscript T the transpose matrix. 



(III.17) 
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B. Inverse of (A, A) 



Since the above correlation matrix is block-diagonal, it is trivial to obtain the inverse (A, A) ^ as 

/ (j-^) . o o\ 






(A,A)-^ 



o 



0^ 



0^ 



V 



0^ 0^ jrJab\k)J 

Here the inverse matrices (L"-*^)^^, Xab(^) "^abi^) defined as 



(III.18) 



La-y(L ^)^^ = 15a(3, YXac{k)Xch{k) = Sab, Jac{k)jJ{k) = 5ab- 

7=1 c=l 



c=l 



C. The frequency matrix ift 



Here we compute the frequency matrix ift which is defined as 

inA.^E(AA',A;,)[(A,A)-l],,^ 
A' 

We first look at the elements of the matrix (A, A): 

/ (AR,,AR^) (Pa,AR^) (5p£,AR^) (J£,A%)A 
(AR,,P^) (Pa,P^) (5p£,P/3) (J£,P/3) 
(AR,,p^) {P^,pi) {6plpi) {31 pi) 

V (AR,,4) (Pa, 4) (^p^jk) (j£,jk); 

First we obtain some elements of A using the Liouville operator ()ni.2p . 



(A, A) 



ARq, = iCAKa 



Pl = iCpi = ik-Ji 



(HI.19) 



(HI.20) 



(HI.21) 



where is the total force exerted on the ath solute particle by the solvent as well as by other solute 
particles. Actually when we compute the elements involving P or J^, it is more convenient to use the 
integration by parts. It is useful to remember that whereas AR and involve single momentum (P 
or Pi), P and involve zero (since pf is the force acting on the ath atom of the ith molecule, which 
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only involves the positions of solute particles and solvent molecules), or two momentums (two pj). 
Using this fact, we can easily identify the vanishing elements: 



(AR„,AR^) 

(Pa,P/3) 

(Jk,AR/3) 
The nonvanishing elements are 

(AR„,P/3) 
(P„,AR;3) 
('^Pk,jk) 

Taking all these into account, we obtain 

O ksTldap 



0, (AR„,/)^) = 0, (AR„,j^)=0 

0, (P,,p£) = 0, (P,,j^) = 0, 

0, (5p£,P/,) = 0, {6pl,pi) = G, 

0, (J£,P/3) = 0, (JLjk) = o 



M 



(P„,P^) = -keTUa 



M 



(J£,4)-ik = iiVkJ,b(A:) 



in 



V 



-ksTlSap 
0^ 
0^ 



O 

0^ 



\ 



iNliJabik) 

iNkJabik) / 



fA,A 



,-1 



in 



Using the inverse correlation matrix pil.isp . we compute in as 

/ O j^l6^^ 0\ 

-fcBr(L-i)^^ o 

0^ 0^ iMab 

V 0^ 0^ i'i^Ec=iJac{k)x-bHk) 0/ 



(III.22) 



(III.23) 



(III.24) 



(III.25) 



D. The reversible part 



From pi.sp . the reversible part of the Langevin equation is given by in ■ A{t). Using pil.25p . we 
obtain 

/ Pa(i)/M„ \ 

ik.J£(t) 

V ^^Eb,cJacik)x;b\k)5pi{t) J 



in ■ A{t) 



(III.26) 
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E. The fluctuating force 



The fluctuating force at i = from (jII.6P is given by 

f = {1-V)A = A-in- A 
where we used VA = (A, A) • (A, A)^^ A = ifl ■ A. We flrst obtain 

A = 



Pa 

V j£/ 



and 



in ■ A 



(III.27) 



(111.28) 



(111.29) 



ik.J£ 

which is obtained by setting t = in pil.26p . Using the above two results, we obtain for the fluctuating 
force as 







m = e 



it(l-V)C 





V 3k/ 



where 



W„ = + fcsT^ (L-i)^^ • A%, B^l^r^- ik^ J,,(fc)x,V(^)'^Pk 

/3 b,c 



(III.30) 



(III.31) 



F. The memory matrix 



The memory function matrix K(t) is calculated as 
K(t) = (f,f(t))[(A,A)-i] 



/ O O q\ 

o OT:(Wa,w^(t)) jjY.bJabm'ElWpit)) 

0^ 0^ 

W M4T(Wa,Ht(t)) jjY.cJac'm^'L.^iit)) I 



(III.32) 
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where W(t) = exp (it{l - P)£)w and H^(t) = exp (it{l - ■P)>c)H£. 

In ()III.32p . the two terms exhibit exphcit A^-dependence. In the thermodynamic hmit in 
which N is taken to be infinite while the number of solute particles remains finite, the term 
J^"^(A;)(H^, W(t)) will vanish since the ensemble average (Hk,W(t)) will remain finite. The 
other term •^ac"'^(^)(^k5 '^k(O) '^i^^ '^ot vanish since the ensemble average (S^,H^(t)) is propor- 

tional to N . Therefore only the latter term survives in the thermodynamic limit. 

In Appendix B, we show that 



(W,,H^(t))=0 

Therefore the final expression of the memory matrix is given by 

/ O O 

O OT(W.,W^(t)) 



K(t) 



0^ 

T 









0^ 

0^ j,Y.cJac'm^i.^i{t)) j 



(III.33) 



(III.34) 



G. The explicit form of the exact dynamic equations 



With the explicit results of the previous sections, we here write down an exlicit form for the time 
evolution equation pi.SP 



dt 

dPajt) 
dt 

dt 



-kBTj2 i^'Xf, ■ AR^t) - /J ds ^ T^^it - s) ■ ^ + W„(t), 



ik.J£(t), 



dt 



kE Jacik)x-tik)6pi{t) - 1 E J-,\k) f ds m^{t - s) ■ + H£(t) 



(III.35) 



In the above set of dynamic equations for the solute and solvent molecules, the random forces take 
the following forms 
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SUt) = e^'^'-^^^{3i-ikJ2Jac{k)x-,Hk)6pi) (111.36) 



b,c 

The memory functions in OILSSh are given by the time correlations of the random forces; 

^ I ' -•■ ,^^\ - 'bc/j.\ /•^b/j.\'^c 



TaMt) = j^{W^{t)Wp{0)), Mi'it) = (Si{t)S%{0)). (III.37) 

IV. DISCUSSIONS 
A. Solvent dynamics 

When the fluid is far from protein, or in bulk, where perturbation from protein vanishes, the last 
two expressions concerning solvent in Eq. (III.35) reduce to the equations for pure-water dynamics. 



19( 1 , except for an approximation made in the factor 



Eqs. (24 ) and (25), derived by one of the authors 
Jab [201 ] ■ The equation is further simplified to produce the site-site Smoluchowski-Vlasov (SSSV) 
equation if one makes the memory function local in time as well as in space. The equation can 
be analytically solved by means of the Laplace transform to produce the van Hove or space-time 
correlation function of water, with the input of the site-site pair correlation functions of the solvent 
obtained from the RISM theory. The theory has been successfully applied to a variety of solvent 
relaxation processes induced by an abrupt change in the electronic structure of a solute molecule, or 
solvation dynamics, which can be probed by the dynamic Stokes-shift |2l|. So, the two equations 
concerning solvent in (III. 35) can be regarded as a generalization of the previous theories developed 
for pure water to that subject to the field exerted from protein atoms. There are several remarks to be 
made with respect to the generalization. Firstly, the translational invariance of the system is no longer 
valid. Therefore, the equations should be solved in three-dimensional Cartesian-space. Secondly, the 
factor Xa6(r, r') = N~^{6p°'{r)6p^{r')) appearing in the equation is a two body density correlation 
function, but subject to the "external force" due to protein. Such a theory for obtaining the function 
is under development, but it is too primitive at the moment to be applied to the problem we are 
facing. Therefore, we may adopt the superposition approximation {6p°'{r)6p^{r')) = {6p°'{r)){5p^{r')) 
to this case. Then, (5p"(r)) can be readily evaluated from the 3D-RISM theory. 

A number of possible applications of the dynamic equations for the solvent are conceivable. An 
interesting example is the current-current correlation function (J"(r, 0) J^(r', t) > of water and ions in 
a molecular channel, which is concerned with many observables including the permeability of water 



and ions across the cell membrane 



The equation for the correlation function can be readily 



obtained by coupling the two equations for solvent with the aid of the 3D-RISM/RISM theory. 
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B. Solute dynamics 

The first two equations in Eq. (III. 35) concerning solute dynamics are combined together to resuh 

in 

^ /^^ait) ^ ^ _ ^^^^^^ ^ ^ ^^^^^ _ . ^ ^^^^^ 

The equation is regarded as a generalization of the equation for a coupled set of Langevin-oscillators, 
first examined by Wang and Uhlenbeck 23], to a realistic model of protein in water. Wang and 
Uhlenbeck proposed a model in which a coupled set of oscillators consisting of spherical beads is 
immersed in a viscous liquid, and applied the Langevin theory to the oscillators. Later on, Lamm 
and Szabo [24;] performed a normal mode analysis on the Wang-Uhlenbeck oscillators, assuming a 



26| applied the Langevin mode 



phenomenological friction term. Kottlam and Case |25|, and Ansari 
method of Lamm and Szabo to proteins. The same method was also applied to the dynamics of 
DNA and RNA [28] in solvents. A review on the normal mode analysis in general (including the 



Langevin mode analysis) in the dynamics of biomolecules is presented in [29]. 

There are several comments to be made on the new equation (jIV.ip . Firstly, the equation does 
not include a term related to the force which originates from the first derivative of the free energy 
surface with respect to the position. The force acting on an atom of protein comes from the three 
contributions, one which is proportional to the displacement of the atom from its equilibrium position 
(the second term in the left hand side in OV.ip ). and the friction term proportional to the velocities 
of the atoms, and that due to the random force (the term in the right hand side in pv.ip ). The 
physical origin as to why the equation does not include the first derivative of the free energy lies in 
our treatment based on the generalized Langevin theory. The whole idea of the generalized Langevin 
theory is to project all the degrees of freedom in the phase space onto few dynamic variables under 
concern. The projection is carried out using a projection operator, defined by Eqs. (IL3) and (n.4), 
in terms of an ensemble average of two variables which are fiuctuating around an equilibrium in the 
phase space. Obviously by definition, the ensemble average of the displacement of atoms in protein 
should be zero in equilibrium. 

Such a force as the first derivative of the free energy, which may cause the complete shift of the 
equilibrium, is not included in the treatment. The situation is somewhat analogous to the case of a 
harmonic oscillator, in which an oscillator swings back and force around a minimum of the harmonic 
potential. Only force acts on the system is the restoring force proportional to the displacement from 
the potential minimum. In our case, too, only force acting on the protein atoms is the one which 
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restores atom positions from fluctuating to equilibrium ones. However, there is an essential difference 
in physics between the two systems. The equilibrium position of a harmonic oscillator is the minimum 
of mechanical potential energy, while that of protein in water is the minimum in the thermodynamic 
potential or the free energy, which is concerned not only with energy but also with the entropy both 
of protein and of water. So, in the case of protein in water, the stochastic character of the dynamics 
is attributed not only to the random force term, but also to the conformational fluctuation of protein 
around its equilibrium state, induced by solvent, while the stochastic character is resulted just from 
the random force term in the case of the coupled harmonic oscillators treated by Wang and Unlenbeck. 

The argument above suggests interesting physics implied in Eq. pV.ip . and its application to 
biological functions. If one ignores the friction and random force terms in Eq. OV.lh . one gets 

This equation can be viewed as a coupled set of "harmonic oscillators" , whose "Hessian" matrix is given 
by ^^^(L^^)^,^. Considering Eq. (HI. 13), the "Hessian" matrix is related to the variance-covariance 
matrix of the positional fluctuation by 

ksTL-^ = kBT(AK AkV^ (IV.3) 



The observation strongly suggests that the dynamics described by Eq. ()IV.2p is that of fluctuation 
around a minimum of the free energy surface consisting not only of the interactions among atoms 
in the protein, but of the solvation free energy. In this respect, the configuration corresponding to 
the free energy minimum is not just one but an ensemble of distinguishable configurations concerning 
protein and solvent, which can be converted among each other due to the thermal noise. The free 
energy surface can be given by 

F({AR}) = [/({AR}) + A^({AR}) (IV.4) 
where C/({AR}) is the interaction potential energy among atoms in a protein, and A/i({AR}) is the 



solvation free energy of protein whose conformation is {R} [3C|. 

The above consideration further suggests a method to evaluate the variance-covariance matrix, 
which characterizes structural fluctuation of protein, based on the 3D-RISM theory. The variance- 
covariance matrix is closely related to the Hessian matrix, Eq. pv.3p . and the Hessian matrix is the 
second derivative of the free energy surface, namely. 
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Since the free energy F({AR}) can be obtained by solving the 3D-RISM/RISM equation, Eq. pV.Sp 
provides a way to evaluate the variance-covariance matrix. 

The variance-convariance matrix is by itself quite informative for characterizing the structural 
fluctuation of protein around its native state in atomic detail. As an example, let us consider a hinge- 
bending motion of protein. The variance-covariance matrix should have a structure in which a block 
of elements (ARq,AR^) for atom pairs, a, /?, belonging to the two sides of the hinge-axis, have the 
negative sign because the direction of the displacements ARq and AR^ is opposite. 

Usefulness of the variance-covariance matrix is not limited to characterization of the structural 
fluctuation around an equilibrium state. The Eq. pV.Sp implies that the free energy of protein at an 
equilibrium conformation takes the form 

F({AR}) = hsTY^AR^ . (L-i)^^ ■ AR^. (IV.6) 

In the presence of a small perturbation due to, say, ligand binding, the above free energy can be 
changed due to the perturbation as 

F({ AR}) = \kBTj2 ARa • (L"')„;3 • AR-/? " E ^^c. ■ fa (IV.7) 

where fa is the force acting on the ath protein atom due to the perturbation. Then, the conformational 
change due to the perturbation can be determined by the variational principle 

dF 

0. (IV.8) 



With Eq. (HEIl), Eq. (Ura gives 

(aR,)^ = {knT)-' J2 (ARaAK/j) ■ f/3 (IV.9) 

where the subscript 1(0) denotes the presence (absence) of the perturbation. Therefore, Eq. pV.5P 
combined with Eq. (|IV.9P provides a theoretical basis for analyzing the conformational relaxation of 
protein in water due to a perturbation such as ligand binding. Th Eq. (|IV.9p is first derived by Ikeguchi 
et. al. 0] based on the linear response theory. 

The equation pV.ip is also a generalized equation which provides molecular basis for the phe- 
nomenological Rouse-Zimm model of the polymer dynamics 311], with a proper account of the variance- 



covariance matrix, the diagonal terms of which correspond to the mean square displacement of each 
atom in equilibrium states. This suggests that the theory can be applied not only to the native 
conformation of protein but also to characterizing the denatured or random-coil state. However, the 
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application requires special care of the ensemble average to evaluate the variance-covariance matrix, 
since the average, by definition, should be taken over virtuahy an infinite number of conformations 
randomly appearing in the solution. Nevertheless, a practical method to evaluate the variance- co- 
variance matrix for the random-coil state of protein can be suggested based on the 3D-RISM/RISM 
theory as follows. First, produce some small number of conformations for protein in water by means of 
a generalized ensemble technique such as the replica-exchange algorithm. Second, evaluate the second 
derivative of the free energy surface of each conformation based on Eq. OV.Sh . and take the average 
of the results over the conformations, which will give rise to the variance-covariance matrix for the 
sampled conformational space. Third, add more conformations to the sample to take the average. Re- 
peat the procedure until the convergence is attained. Our implication is that the convergence will be 
attained rather quickly, because the variance-covariance matrix for each conformation, obtained from 
Eq. pV.Sp . is already an average over a large number of conformations in the free energy surface. The 
converged variance-covariance matrix can be compared with observable quantities which characterize 
a random coil state of protein, such as the gyration radius and the distribution of end-to-end distance. 

V. CONCLUDING REMARKS 

In the present work, we have proposed a new theory of dynamics based on the generalized Langevin 
theory, which can be applied to structural fluctuation around a native state of protein in water. 
The displacement vector of atom positions and their conjugated momentum, are chosen for dynamic 
variables for protein, while the density fields of atoms and their momentum fields are chosen for water. 
Projection of other degrees of freedom onto those dynamic variables using the standard projection 
operator method produced essentially two equations which describe the time evolution of fluctuation 
concerning the density field of solvent and the conformation of protein around an equilibrium state, 
which are correlated each other. 

The equation of motion for protein atoms in water is formally akin to that of the Langevin equation 
for coupled harmonic oscillators in the continuum solvent, examined by Wang and Uhlenbeck long 
time ago. However, there exists a substantial and important difference. Unlike the coupled set of 
harmonic oscillators, the "Hessian" included in the term corresponding to the Hookian-like restoring 
force in the new equation is identified as a variance-covariance matrix of the displacement vector, 
which is nothing but the second moment of the structural fluctuation. Since the fluctuation is taking 
place around the thermodynamic equilibrium, not just around a minimum of a (mechanical) harmonic 
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potential, the "Hessian" matrix should be related to the second derivative of the free energy surface 
of protein, which of course includes the influence of solvent. 

All those findings suggest that wc arc now at the position where we can explore the conformational 
fluctuation around a native state of protein, correlated with the relaxation of water density, since a 
method to evaluate the free energy surface and its first derivative of protein in water has been well 
established already based on the "3D-RISM/RISM" theory. It is not difficult to calculate the second 
derivative of the free energy surface from the first derivatives. 

The finding further suggests even more practical applications related to the drug design. The 3D- 
RISM/RISM theory has been successfully applied to a variety processes of molecular recognition in 
protein, including drug binding. However, so far the application has been limited to a fixed confor- 
mation of protein, which of course cannot take into account the effects of conformational fluctuation, 
such as the induced fitting. With the aid of the linear response theory, the new formulation provides a 
foundation to evaluate the effect of conformational fluctuation in the process of molecular recognition. 
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Appendix A: Calculation of B^, ^(k) 

We show that the correlation BQ^fe(k) = (AR^ 5/?^) vanishes in the homogeneous system. An 
explicit expression for Bq, ^(k) is given by 

B„,b(k) ^ (AR„ 6pi) = ^ fdT, AR, e^'^^'e"''" (A.l) 

where Tc denotes collection of the position variables only, and Zc is the corresponding partition 
function defined as Zc = / dTc exp ( — f3U) with U being the total potential energy of the system. 
Shifting integration variables in (|A.ip as r^* — )• r[' + ARq, and AR/j — > AR/j + ARq. (/3 7^ a), one can 
separate the ARc-integration as follows; 

(AR^c^Pk) - J^^^;;^ JdAK^u-iJdr-^e-P^' ^ ' 

where the potential energy W is obtained from V( with the shift of the variables, and does not involve 
Rq,. We consider the first integration factor in (|A.2p . For simplicity we suppress the index a. Its 
X-component is given by 



Note that this integral vanishes in the thermodynamic limit L — )• 00 due to the oscillating term 
g2k R_ -pj^g second integration factor in ()A.2p remains finite. Therefore we obtain {Xp\r) = in the 
thermodynamic limit. Since this will hold for the other components, we conclude that 

B„,b(k) = (A.4) 

Appendix B: Calculation of (WQ,H^(i)) 

Here we show that the memory matrix (WQ,,H^(f)) vanishes. We consider 

(W,, H^(t)) = (W, e'^^^Ei) = (W, (1 + tQiC + j^ QiCQiC + • • • ) H^) (B.l) 

where 2 = 1 — 7-". For simplicity of notation, we write Wq and from ()III.3ip as 

W„ = F„ + M^^jAR^, = - Cfee(k) 5pi (B.2) 

where the summation is implied for repeated indices, and the matrix M = fc^TL^^ and Cbe(k) = 
ikJM{k)Xd}{^)- 
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We first consider the first term of (jB.ip 

[FJi) - CU^)(F^dpi) + M„^(aR^4) - C,e(k)M,;3(AR^5p|) (B.3) 



We already showed that the last two terms vanish ( ()III.22p and Appendix A). We now show that the 
two terms vanish as well. Let us first consider the second term in (IB.3p . We have 



where the last equality results upon doing the Ra-integration by parts. Obviously we will have the 
same result for the first term, {^Fq-J^) = 0, since does not contain Rq,. Therefore we showed that 

W,H^\ = 0. (B.5) 



Next we consider the second term in (IBH) : (WatQiCSl^ = (Wat{l - 'P)S'Q. We need to 
compute the term VS^. From (jlLSh . one can obtain 

+ N-'x-^ik){5p\Si)5pi + N-\Jad{k){r^Si)ji 

= iV-V;i(k)(5p^k3k)5Pk + A^"'^d(k)(j"-k3k)jk (B.6) 



where the last equality holds since {^Ro3^y = — (^P^H^y = and {^PoH^y = 0. Using (B.6), we have 
QSi = (1 - V)Si = Si- N'\-^{k)(6p\Si)6pi - N-'Ja,{k)(3\Si)3i (B.7) 



It is important to note that QTZ^ only involves the solvent coordinates and momenta. Then, using 
the orthogonality relations (WaSp^) = and ( W^J^) = 0, one obtains 



F„3t) + M„;3(aR^ si) = (B.8) 



where in the last line the first term vanish from the argument shown in (B.4), and we already showed 
the second term vanishes. 

It is now clear that the repeated applications of QiC on will never generate the solute-variable 
components, and hence (^Wa{QiC)"Si) = 0. Therefore we obtain the final result 

Wc.e*2i^H^) = 0. (B.9) 
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